Efficient incremental mining of contrast patterns in changing data
Abstract
A contrast pattern, also known as an emerging pattern [7], is an itemset whose frequency differs significantly between two classes of data. Such patterns describe differences between datasets and have been shown to be useful for building powerful classifiers [11, 9, 2, 8]. Mining them incrementally is important when the data changes: transactions can be inserted and deleted, and mining must be repeated after changes occur. When the changes are small, previously mined contrast patterns should be reused where possible to compute the new patterns. A primary example of changing data is a data stream, a sequence of continuously arriving transactions (or itemsets). Mining contrast patterns in a data stream is useful for stream classification [2] and network traffic change detection [4]. The work in [10] presented an algorithm to incrementally mine contrast patterns, but it is oriented to updates of a single type. When a dataset changes through insertion and deletion together, the efficiency of the approach in [10] is reduced by redundant computations. In this paper, we present a new algorithm that addresses the scenario of incrementally mining contrast patterns in response to simultaneous insertion and deletion. Our ap-
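To make the definition above concrete, the following is a minimal sketch of per-class support counts that are updated in place when transactions are inserted or deleted. It is not the algorithm of this paper or of [10]: the class name `ContrastCounter`, the `max_len` cap, and the use of an absolute support difference (`min_diff`) as the contrast test are all assumptions chosen for illustration, since the abstract does not specify the measure or the data structure.

```python
from collections import Counter
from itertools import combinations


class ContrastCounter:
    """Naive incremental support counts for two classes (illustration only)."""

    def __init__(self, max_len=2):
        self.max_len = max_len                       # hypothetical cap on itemset size
        self.counts = {0: Counter(), 1: Counter()}   # itemset -> count, per class
        self.sizes = {0: 0, 1: 0}                    # transactions per class

    def _itemsets(self, transaction):
        # Enumerate every sub-itemset of the transaction up to max_len items.
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            for combo in combinations(items, k):
                yield frozenset(combo)

    def insert(self, transaction, label):
        for itemset in self._itemsets(transaction):
            self.counts[label][itemset] += 1
        self.sizes[label] += 1

    def delete(self, transaction, label):
        # Assumes the transaction was previously inserted with this label.
        for itemset in self._itemsets(transaction):
            self.counts[label][itemset] -= 1
        self.sizes[label] -= 1

    def support(self, itemset, label):
        n = self.sizes[label]
        return self.counts[label][frozenset(itemset)] / n if n else 0.0

    def contrast_patterns(self, min_diff):
        # Assumed contrast test: absolute support difference between classes.
        candidates = set(self.counts[0]) | set(self.counts[1])
        return {s for s in candidates
                if abs(self.support(s, 0) - self.support(s, 1)) >= min_diff}


# Usage: a simultaneous deletion and insertion touches only the affected counts.
cc = ContrastCounter(max_len=2)
cc.insert(["a", "b"], 0)
cc.insert(["a"], 0)
cc.insert(["b"], 1)
cc.insert(["b", "c"], 1)
before = cc.contrast_patterns(0.75)
cc.delete(["a"], 0)
cc.insert(["b", "c"], 0)
after = cc.contrast_patterns(0.5)
```

Enumerating all sub-itemsets is exponential in `max_len`, so this sketch only shows how counts can be reused across updates rather than recomputed; a practical miner would prune candidates.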
Similar resources
Incremental update on sequential patterns in large databases
Mining of sequential patterns in a transactional database is time-consuming due to its complexity. Maintaining present patterns after a database update is also a non-trivial task, since appended data sequences may invalidate old patterns and create new ones. In contrast to re-mining, the proposed incremental update algorithm, which effectively utilizes discovered knowledge, is the key to improve m...
Incremental Mining for Frequent Patterns in Evolving Time Series Databases
Several emerging applications warrant mining and discovering hidden frequent patterns in time series databases, e.g., sensor networks, environment monitoring, and inventory stock monitoring. Time series databases are characterized by two features: (1) The continuous arrival of data and (2) the time dimension. These features raise new challenges for data mining such as the need for online proces...
Mining Closed-Regular Patterns in Incremental Transactional Databases using Vertical Data Format
Regular pattern mining on incremental databases is a novel approach in data mining research. Recently, closed itemset mining has gained a lot of consideration in the mining process. In this paper we propose a new mining method called CRPMID (Closed-regular Pattern Mining on Incremental Databases) with a sliding window technique using the vertical data format. This method generates the complete set of closed-re...
Incremental Mining for Regular Frequent Patterns in Vertical Format
In the real world, databases are updated continuously in online applications such as supermarkets, network monitoring, web administration, and stock markets. Frequent pattern mining is a fundamental and essential area in data mining research. Not only the occurrence frequency of a pattern but also its occurrence behaviour may be treated as important criteria to measure the interestingness of...
High Fuzzy Utility Based Frequent Patterns Mining Approach for Mobile Web Services Sequences
High fuzzy utility based pattern mining is an emerging topic in data mining. It refers to discovering all patterns whose utility meets a user-specified minimum high utility threshold. It comprises extracting patterns which are highly accessed in mobile web service sequences. Different from the traditional fuzzy approach, high fuzzy utility mining considers not only counts of mob...
Journal:
- Inf. Process. Lett.
Volume 110
Publication date: 2010